Searching Object-Relational DBMS Features for Improving Efficiency and Scalability of Decision Tree Algorithms
Authors
Abstract
Memory-based decision tree algorithms, such as C4.5 and its derivatives, do not scale well because they depend on the available memory. If the dataset being classified is very large, efficiency also suffers, since a large number of cases has to be traversed many times. Fortunately, Object-Relational DBMSes offer features (such as indexing, query optimization, improved SQL, and stored procedures) that speed up queries and business logic and could be used in implementing the algorithms. This paper presents the features that would be useful in improving the efficiency and scalability of the algorithms. In the last section, we also outline our proposed technique, which utilizes these features, and the results of early experiments.
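As a rough illustration of the idea of pushing work into the database, the sketch below lets SQL do the per-attribute class counting that a decision tree needs when choosing a split, so that only small aggregate tables reach application memory. It is not the technique proposed in the paper. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in for an object-relational DBMS, plain information gain rather than C4.5's gain ratio, and a hypothetical table `cases` with hypothetical columns `outlook`, `windy`, and `label`.

```python
import math
import sqlite3


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(conn, attribute):
    """Information gain of splitting the hypothetical `cases` table on `attribute`.

    The database performs the counting with GROUP BY, so only one row per
    (attribute value, label) pair is returned to Python.
    The attribute name is interpolated into the SQL text, so it must come
    from a trusted list of column names.
    """
    rows = conn.execute(
        f"SELECT {attribute}, label, COUNT(*) FROM cases "
        f"GROUP BY {attribute}, label"
    ).fetchall()

    counts_by_value = {}   # attribute value -> list of class counts
    counts_by_label = {}   # class label -> total count
    for value, label, n in rows:
        counts_by_value.setdefault(value, []).append(n)
        counts_by_label[label] = counts_by_label.get(label, 0) + n

    total = sum(counts_by_label.values())
    remainder = sum(
        (sum(counts) / total) * entropy(counts)
        for counts in counts_by_value.values()
    )
    return entropy(list(counts_by_label.values())) - remainder


def build_demo():
    """Create a tiny in-memory table; the data is made up for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (outlook TEXT, windy TEXT, label TEXT)")
    conn.executemany(
        "INSERT INTO cases VALUES (?, ?, ?)",
        [
            ("sunny", "yes", "no"),
            ("sunny", "no", "no"),
            ("rainy", "yes", "yes"),
            ("rainy", "no", "yes"),
        ],
    )
    # An index on the grouped columns is the kind of DBMS feature the
    # abstract mentions; whether it helps depends on the engine and the data.
    conn.execute("CREATE INDEX idx_cases_outlook ON cases (outlook, label)")
    return conn


if __name__ == "__main__":
    conn = build_demo()
    for attr in ("outlook", "windy"):
        print(attr, information_gain(conn, attr))
```

In this toy table `outlook` fully determines the label while `windy` carries no information, so a tree builder using this criterion would split on `outlook` first. In a server DBMS the same counting query could be wrapped in a stored procedure, which is one of the features the abstract lists.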
Similar resources
Comparison of Performance in Image Classification Algorithms of Satellite in Detection of Sarakhs Sandy zones
Extended abstract 1- Introduction Wind erosion as an “environmental threat” has caused serious problems in the world. Identifying and evaluating areas affected by wind erosion can be an important tool for managers and planners in the sustainable development of different areas. Nowadays there are various methods in the world for zoning lands affected by wind erosion. One of the most important...
The Revolution in Database System Architecture
Database system architectures are undergoing revolutionary changes. Most importantly, algorithms and data are being unified by integrating programming languages with the database system. This gives an extensible object-relational system where nonprocedural relational operators manipulate object sets. Coupled with this, each DBMS is now a web service. This has huge implications for how we struct...
Using SQL primitives and parallel DB servers to speed up knowledge discovery in large relational databases
Efficiency is crucial in KDD (Knowledge Discovery in Databases), due to the huge amount of data stored in commercial databases. We argue that high efficiency in KDD can be achieved by combining two approaches, namely mapping KDD functionality onto standard DBMS operations and executing KDD tasks on a parallel SQL server. We propose generic KDD primitives which underlie the candidate-rule evaluat...
A New Hybrid Method for Improving the Performance of Myocardial Infarction Prediction
Abstract Introduction: Myocardial Infarction, also known as heart attack, normally occurs due to such causes as smoking, family history, diabetes, and so on. It is recognized as one of the leading causes of death in the world. Therefore, the present study aimed to evaluate the performance of classification models in order to predict Myocardial Infarction, using a feature selection method tha...
Improving the Performance of Machine Learning Algorithms for Heart Disease Diagnosis by Optimizing Data and Features
The heart is one of the most important organs of the body, and heart disease is the major cause of death in the world and in Iran. This is why early, timely diagnosis is one of the most significant factors in preventing and reducing deaths from this disease. So far, many studies have been done on heart disease with the aim of prediction, diagnosis, and treatment. However, most of them have been mostly f...